Skip to content

Conversation

@agocke
Copy link
Member

@agocke agocke commented Jul 5, 2025

This pull request introduces several improvements to enhance .NET 8 and AOT (Ahead-Of-Time) compatibility, refactors type handling for dynamic payload parsing, and adds platform-specific attributes for better code safety. The most significant change is the systematic replacement of Type with a custom FetchType abstraction in dynamic event payload parsing, which improves performance and AOT support.

.NET 8 & AOT Compatibility Enhancements:

  • Added net8.0 as a target framework in FastSerialization.csproj and set <IsAotCompatible> and <PolySharpIncludeRuntimeSupportedAttributes> properties to improve AOT support. [1] [2]
  • Introduced the PolySharp package and corresponding configuration in both Directory.Packages.props and FastSerialization.csproj to provide polyfills for runtime attributes and enhance compatibility across target frameworks. [1] [2]

Dynamic Payload Parsing Refactor:

  • Refactored DynamicTraceEventParser to use a new FetchType abstraction instead of Type for payload type handling, improving performance and making the code more AOT-friendly. This includes changes to array handling, type conversions, and default value retrieval via the new FetchTypeHelpers utilities. [1] [2] [3] [4] [5] [6] [7] [8] [9] [10]
  • Updated method signatures and logic to consistently use FetchType and improved error messages for unsupported types. [1] [2]

Platform and Trimming Safety Annotations:

  • Added [SupportedOSPlatform("windows")] to MemoryMappedFileStreamReader constructor and [RequiresUnreferencedCode] to DirectoryAnalyzerResolver to clarify platform and trimming requirements for AOT and static analysis. [1] [2]

Source Modernization and Minor Improvements:

  • Added missing using directives for System.Diagnostics.CodeAnalysis, System.Runtime.Versioning, and related namespaces to support new attributes and maintain code clarity. [1] [2] [3] [4] [5]
  • Updated usage of Marshal.SizeOf to the generic Marshal.SizeOf<T>() for clarity and type safety.

Other Cleanups:

  • Removed an unused private method from CtfChannel.cs to reduce code clutter.

These changes collectively modernize the codebase, making it more maintainable, performant, and compatible with newer .NET runtimes and deployment scenarios.

agocke added 2 commits July 4, 2025 22:14
Fixes up simple annotation issues and APIs that have an easy AOT-safe
alternative.
@agocke
Copy link
Member Author

agocke commented Jul 5, 2025

cc @hoyosjs @tommcdon I think this is needed for making dotnet-gcdump AOT-compatible

@brianrob
Copy link
Member

brianrob commented Jul 6, 2025

I've had a chance to look through this PR. I'd like to have a conversation about what the goal is here and have some follow-up questions on several of the changes. My initial thought is that this change brings in several things that I don't want to maintain and/or I don't believe will work for TraceEvent all-up. To preview a few things:

  • I really do not want to add a new TFM for these libraries. We used to target multiple TFMs and this was much more complex for a variety of reasons.
  • I will need help understanding the requirement for PolySharp and how this impacts runtime execution. PerfView embeds its own dependencies, and so we'd need to understand the impact here.
  • Primitive-only deserialization will not work - This will break arbitrary trace files.

I will likely have more questions if we can resolve these issues, but these are the big rocks.

int bytes = _ctfStream.EventHeader.GetSize();
}

private bool ReadStruct<T>(out T result) where T : struct
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method appears unused

{
_log.WriteLine("Trying to generate the file {0}.", target);
var toolsDir = Path.GetDirectoryName(typeof(SourceFile).GetTypeInfo().Assembly.ManifestModule.FullyQualifiedName);
var archToolsDir = Path.Combine(toolsDir, NativeDlls.ProcessArchitectureDirectory);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This variable looks unused

<ItemGroup>
<PackageReference Include="Microsoft.Diagnostics.Tracing.TraceEvent.SupportFiles" Version="$(MicrosoftDiagnosticsTracingTraceEventSupportFilesVersion)" />
<PackageReference Include="Microsoft.Win32.Registry" Version="$(MicrosoftWin32RegistryVersion)" />
<PackageReference Include="PolySharp" Version="1.14.1">
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This package polyfills attributes for netstandard2.0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use latest version 1.15.0? Any breaking change that prevents usage?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably works fine? I just did nuget add package and it did this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wired that this command adds on older version and not the most recent one. Try the update and look what happens.

#if NET
manifestStream.ReadExactly(serializedManifest, 0, len);
#else
manifestStream.Read(serializedManifest, 0, len);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is almost certainly a bug

@agocke
Copy link
Member Author

agocke commented Jul 6, 2025

I'd like to have a conversation about what the goal is here

The main goal is to support Native AOT for a number of diagnostic tools that use this library. Most choices are downstream of that goal:

I really do not want to add a new TFM for these libraries.

There isn't really a choice here. netstandard2.0 isn't annotated for trimming/AOT, so in order to see the incompatibilities you have to multi-target.

I will need help understanding the requirement for PolySharp and how this impacts runtime execution. PerfView embeds its own dependencies, and so we'd need to understand the impact here.

Polysharp just adds attributes so we can avoid ifdef'ing every usage of RequiresUnreferencedCode or SupportedOS. It shouldn't have any runtime impact.

Primitive-only deserialization will not work - This will break arbitrary trace files.

This is the key part of a "feature switch" -- it changes behavior when intent to trim (via PublishAot/PublishTrimmed/IsTrimmable/IsAotCompatible) is enabled. This will produce different behavior when trimming, but the old behavior is categorically impossible to support when trimming, so it's kind of the best middle-ground on offer. The alternative is to simply disallow use of the API entirely, which is problematic because it is pervasive in this library.

@brianrob
Copy link
Member

brianrob commented Jul 7, 2025

The main goal is to support Native AOT for a number of diagnostic tools that use this library. Most choices are downstream of that goal:

I think that we should get together to discuss the goals and priority here. As it stands, I am not inclined to take this change - this puts a burden on TraceEvent and its maintainers that isn't free and creates another row in the test matrix (AOT). I think we need to understand the business need here.

@agocke
Copy link
Member Author

agocke commented Jul 7, 2025

@tommcdon @steveisok @noahfalk can you provide some guidance here? My understanding was that it’s a long-term goal to move the diagnostics tools to aot to eliminate the dependency on a shared runtime in diagnostics collection scenarios. Should trace event be part of that scenario, or is removing the dependency an option?

@noahfalk
Copy link
Member

noahfalk commented Jul 8, 2025

@tommcdon @steveisok @noahfalk can you provide some guidance here? My understanding was that it’s a long-term goal to move the diagnostics tools to aot to eliminate the dependency on a shared runtime in diagnostics collection scenarios. Should trace event be part of that scenario, or is removing the dependency an option?

I'd agree its a long-term goal. So far it hasn't been high enough priority that we've put any effort into it and I wasn't thinking of it as pressing. Is there anything creating urgency or could we just park this and revisit when converting the tools is a higher priority?

In terms of the dependency I think we have a mixture of:

  • some tools don't use TraceEvent at all
  • some tool features need potentially constrained subsets like NetTrace parsing and gcdump formatting
  • some tool features like 'dotnet-trace report topN' probably have much broader code usage within TraceEvent. We could potentially cut or modify features like this to remove the dependency but that has user impacts and development cost.
    It would probably be hard to remove the dependency if we want to keep all the features.

@agocke
Copy link
Member Author

agocke commented Jul 8, 2025

I'm skeptical that there's going to be a future time where we have more time and resources to devote to this compared to now.

Can you say a bit more about what you would expect to change in the future? What new information do we need? What kind of thing would change that would shift our decision?

@agocke
Copy link
Member Author

agocke commented Jul 8, 2025

One more thing: almost every library in .NET multi-targets. It is a recommended strategy for most of our modern .NET scenarios and will likely continue to be popular far into the future. I don't see a reason why EventTrace would be different in this regard. What makes it special compared to every other .NET library?

@noahfalk
Copy link
Member

noahfalk commented Jul 8, 2025

I'm skeptical that there's going to be a future time where we have more time and resources to devote to this compared to now.

Can you say a bit more about what you would expect to change in the future? What new information do we need? What kind of thing would change that would shift our decision?

I largely think of it as priority and available devs to work on it. Andy, was it your plan that you were going to convert all the dotnet-* tools to NativeAOT yourself? If yes then we should try to figure this out now. If not and it fell to the diagnostics team to do the conversion then I am guessing it wouldn't happen until either:

  • priority of converting the tools went up (more customer dissat about acquisition experience or some broader scenario requires it)
  • other work priorities go down/get completed (SFI, engineering, multiple runtime convergence efforts, playing catchup to mandatory runtime features, new diagnostic tools and features, numerous bugs)

Of course if there is something I'm overlooking that makes the conversion more urgent/valuable than I realize do let me know. So far it just wasn't looming particularly large on my radar.

@agocke
Copy link
Member Author

agocke commented Jul 8, 2025

Andy, was it your plan that you were going to convert all the dotnet-* tools to NativeAOT yourself?

I was going to see how much work it would be. I started with dotnet-gcdump. Looks like this was the only blocker -- after I fixed TraceEvent I think dotnet-gcdump just works. I haven't yet looked at the rest of the tools

@brianrob
Copy link
Member

brianrob commented Jul 9, 2025

@noahfalk and I chatted about this offline yesterday. When I look at this change, I am most worried about it from the perspective of support cost. That's both in terms of additional development-time and testing complexity (e.g. new TFM) and behavior complexity (e.g. different behavior across different runtimes).

As a path forward, I think that I would like to step back and understand what the various options are for carving out the set of functionality that is capable of being AOT'd. My read of the current PR (I could be wrong), is that everything would be available for use in NativeAOT, though some behavior would be different. I'd like to take a different approach where we assert that all APIs available in NativeAOT have consistent behavior with non-AOT usage. If that's not possible for an API, then by default, I'd like it to not be available for usage in NativeAOT'd apps. This may or may not produce an issue for the current scenario (diagnostic tools), but I'd much rather have a conversation and find a path forward, knowing that we have API behavior consistency.

I would like to start this exercise by assuming that no APIs are available for NativeAOT, and then let's bring in the minimal set required for the diagnostic tools. As that's happening, if there are APIs that can't be supported, that's when we have a discussion about how to move forward. I can imagine potentially building new APIs that are more NativeAOT friendly so that we can be very clear about what works and what doesn't, and we'd be forcing folks that want to build NativeAOT'd apps that use TraceEvent to make changes to their code when an API that they currently use doesn't work with NativeAOT. I imagine that we have a good amount of experience doing this kind of work, and so I'm hoping that we won't be re-inventing the wheel here.

Regarding adding a new TFM, I'd like to understand what other options exist here if any. Is the issue that we're not targeting netstandard21, or just that no version of netstandard is sufficient for NativeAOT? I'm thinking that if we need to add a new TFM, I'm not sure if I prefer to just add the TFM to TraceEvent, or if I'd rather create a new DLL that contains the AOT'able subset of TraceEvent and allow folks to consume that.

I'm sure that we can find a good path forward here so that we end up in the right place. If we aren't sure how far we're going to get with converting the diagnostic tools to NativeAOT, then it might also be worth sticking with what you have for now, continuing down the path of converting the tools against a private branch of TraceEvent, and if all goes well, then we can come back to how we implement NativeAOT in TraceEvent. If that's too burdensome, we can certainly start to delve into the answers to these questions now.

@agocke
Copy link
Member Author

agocke commented Jul 25, 2025

OK, I've refactored this to completely remove the usage of System.Type. @noahfalk you were right. There was no dynamic usage of Type in this code and it could be completely replaced with simple switch statements.

@agocke
Copy link
Member Author

agocke commented Nov 22, 2025

@noahfalk @brianrob I think this is ready to go. I've narrowed any breaking change to two internal helper methods (GetInt16At and GetByteAt) that now return short and byte, respectively, instead of int. That's needed to avoid annoying reflection.

Even separate from trimming/AOT I think this will be faster and safer due to the removal of reflection.

@brianrob
Copy link
Member

@agocke, thanks. I will re-review this. One thing that I noticed is that this change has a very large number of whitespace-only changes. Can I ask you to revert these so that it's easier to review this? Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants